analysis, we may need to filter out the reads that have low-quality bases or to trim the ends
of the reads beginning from the 34th base. Figure 1.15 shows a per base sequence quality
graph that raises no warning.
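The trimming and filtering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name trim_and_filter, the mean-quality criterion, the threshold of 20, and the Phred+33 encoding are all hypothetical choices, not settings taken from FastQC or from the data set discussed here. In practice a dedicated trimming tool would normally be used.

```python
def trim_and_filter(records, keep_bases=33, min_mean_quality=20, phred_offset=33):
    """Trim each read to its first `keep_bases` bases and drop low-quality reads.

    `records` is an iterable of (header, sequence, quality_string) tuples.
    keep_bases=33 removes everything from the 34th base onward.
    The mean-quality threshold and Phred+33 offset are assumptions.
    """
    for header, seq, qual in records:
        # Keep bases 1..keep_bases; the read end from base 34 is discarded.
        seq, qual = seq[:keep_bases], qual[:keep_bases]
        if not qual:
            continue
        # Decode the ASCII quality characters into Phred scores.
        scores = [ord(ch) - phred_offset for ch in qual]
        if sum(scores) / len(scores) >= min_mean_quality:
            yield header, seq, qual


# Two made-up reads: one with high quality ('I' = Q40), one with low ('#' = Q2).
example = [("@r1", "A" * 40, "I" * 40), ("@r2", "A" * 40, "#" * 40)]
kept = list(trim_and_filter(example))
```

Here only the first read would be kept, shortened to 33 bases. Filtering on the mean score is one possible criterion; a stricter filter could reject a read if any single base falls below a threshold.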
1.5.3 Per Tile Sequence Quality
The per tile sequence quality is represented by a heatmap graph that is available only if an
Illumina sequencer was used and the reads in the FASTQ file retain the original identi-
fiers including the IDs of the flow cell tiles on which reads were sequenced. Figure 1.16
shows the first two records of the FASTQ file “SRR576933.fastq”. Notice that the identifier
line of each record contains the sequence ID, the flow cell ID, lane number, tile number,
the x- and y-coordinates of the cluster within the tile, and read length.
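As a rough illustration of how these fields could be pulled out of an identifier line, the following Python sketch parses a header with a regular expression. The header string used here is made up for the example and is not copied from "SRR576933.fastq"; the colon-separated layout, the "length=" suffix, and the names HEADER_RE and parse_identifier are assumptions, and real files may order or separate the fields differently.

```python
import re

# Assumed layout: @<sequence ID> <flow cell>:<lane>:<tile>:<x>:<y> length=<n>
HEADER_RE = re.compile(
    r"^@(?P<seq_id>\S+)\s+"
    r"(?P<flowcell>[^:\s]+):(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+)"
    r"\s+length=(?P<length>\d+)$"
)


def parse_identifier(line):
    """Return the identifier fields as a dict, or None if the layout differs."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("lane", "tile", "x", "y", "length"):
        fields[key] = int(fields[key])
    return fields


# Hypothetical identifier line, for illustration only.
info = parse_identifier("@SRR000000.1 FC0001:2:1:1134:948 length=36")
stripped = parse_identifier("@read1")  # no tile information retained
```

A header that has lost its original fields, such as the second example, returns None. This mirrors why the per tile graph is unavailable when the identifiers do not retain the tile IDs.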
In the graph, the base positions are plotted on the x-axis against the physical posi-
tions on the flow cell (tile numbers) on the y-axis. The base quality is represented by a color
scale from blue (cold) to red (hot). The blue color indicates that the quality of the base from
the tile is at or above the average for that base position in the run. The red color indicates that the qual-
ity for the base in that tile is worse than the quality for the same base position in the other tiles.
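The idea behind the color scale can be sketched as a deviation from the average: for each base position, compute the mean quality per tile and subtract the average over all tiles. The Python sketch below is a simplified illustration, not FastQC's own code; the function name tile_quality_deviation, the Phred+33 encoding, and the choice to average over tile means are assumptions.

```python
from collections import defaultdict


def tile_quality_deviation(reads, phred_offset=33):
    """Return {tile: {position: mean quality minus the all-tile average}}.

    `reads` is an iterable of (tile_number, quality_string) pairs.
    Values at or above 0 correspond to the "cold" (blue) end of the scale,
    negative values to the "hot" (red) end.
    """
    totals = defaultdict(dict)  # tile -> {position: [score sum, read count]}
    for tile, qual in reads:
        for pos, ch in enumerate(qual):
            cell = totals[tile].setdefault(pos, [0, 0])
            cell[0] += ord(ch) - phred_offset
            cell[1] += 1

    # Mean quality per tile and position.
    means = {
        tile: {pos: total / count for pos, (total, count) in per_pos.items()}
        for tile, per_pos in totals.items()
    }

    deviation = {tile: {} for tile in means}
    positions = sorted({pos for per_pos in means.values() for pos in per_pos})
    for pos in positions:
        values = [m[pos] for m in means.values() if pos in m]
        average = sum(values) / len(values)  # average over tiles (assumption)
        for tile, m in means.items():
            if pos in m:
                deviation[tile][pos] = m[pos] - average
    return deviation


# Made-up two-base reads: 'I' encodes Q40 and '5' encodes Q20 in Phred+33.
dev = tile_quality_deviation([(1, "II"), (1, "II"), (2, "I5")])
```

In this toy input, both tiles match the average at the first position, while tile 2 falls 10 points below the average at the second position and would be drawn toward the red end of the scale.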
The graph provides an easy way to track the average quality scores from each tile across all
FIGURE 1.14 Per base sequence quality with warning.